A Semantic Approach towards CWM-based ETL Processes

Authors

  • Anh Duong
  • Hoang Thi
  • Binh Thanh Nguyen
Abstract

Although the Common Warehouse Metamodel (CWM) provides a common standard for metadata representation and interchange in data warehouse environments, CWM-based ETL processes still face significant challenges in semantically and systematically integrating heterogeneous sources into a data warehouse. In this context, we propose an ontology-based ETL framework that covers both schema integration and semantic integration. In our approach, besides the schema-based semantics in CWM-compliant metamodels, semantic interoperability in ETL processes can be improved by means of an ontology-based foundation for better representation and management of the underlying domain semantics. Furthermore, within the scope of this paper, we define a set of ontology-driven, CWM-based modelling constructs for the metadata required by ETL processes, facilitating the extraction, transformation and loading of useful data from distributed and heterogeneous sources. We thus highlight the role of interconnecting CWM and semantic technologies in populating data warehousing systems with quality data and in providing the data warehouse with an integrated and reconciled view of data.

Similar resources

Requirements Analysis Method For Extracting-Transformation-Loading (ETL) In Data Warehouse Systems

The data warehouse (DW) system design involves several tasks such as defining the DW schemas and the ETL processes specifications, and these have been extensively studied and practiced for many years. The problems in heterogeneous data integration are still far from being resolved due to the complexity of ETL processes and the fundamental problems of data conflicts in information sharing enviro...

Towards a Matrix Based Approach for Analyzing the Impact of Change on ETL Processes

Extraction, Transformation and Loading (ETL) processes aim to extract data from data sources to targets, via a set of transformations. In many situations, an ETL process can be subject to changes for several reasons, for instance data source changes, new requirements and bug fixes. When changes happen, analyzing the impact of change is mandatory to avoid errors and mitigate the risk of break...

A semantic approach to ETL technologies

Data warehouse architectures rely on extraction, transformation and loading (ETL) processes for the creation of an updated, consistent and materialized view of a set of data sources. In this paper, we aim to support these processes by proposing a tool for the semi-automatic definition of inter-attribute semantic mappings and transformation functions. The tool is based on semantic analysis of th...

Ontology Development for ETL Process Design

The Extract, Transform, Load (ETL) process design is difficult to perform because of the ambiguity of user requirements and the complexity of data integration and transformation. Current studies have explored the ontology-based approach to overcome these limitations by reconciling the semantics of user requirements within the ETL process design for easy generation of the ETL process specificati...

ETL Ensembles for Chunking, NER and SRL

We present a new ensemble method that uses Entropy Guided Transformation Learning (ETL) as the base learner. The proposed approach, ETL Committee, combines the main ideas of Bagging and Random Subspaces. We also propose a strategy to include redundancy in transformation-based models. To evaluate the effectiveness of the ensemble method, we apply it to three Natural Language Processing tasks: Te...

Publication date: 2008